Abstract — This paper considers broadcast channels with 
L antennas at the base station and m single-antenna users, 
where L and m are typically of the same order. We assume 
that only partial channel state information is available at 
the base station through a finite rate feedback. Our key 
observation is that the optimal number of on-users (users 
turned on), say s, is a function of signal-to-noise ratio 
(SNR) and feedback rate. In support of this, an asymp- 
totic analysis is employed where L, m and the feedback 
rate approach infinity linearly. We derive the asymptotic 
optimal feedback strategy as well as a realistic criterion to 
decide which users should be turned on. The corresponding 
asymptotic throughput per antenna, which we define as the 
spatial efficiency, turns out to be a function of the number 
of on-users s, and therefore s must be chosen appropriately. 
Based on the asymptotics, a scheme is developed for systems 
with finitely many antennas and users. Compared with other 
studies in which s is presumed constant, our scheme 
achieves a significant gain. Furthermore, our analysis and 
scheme are valid for heterogeneous systems where different 
users may have different path loss coefficients and feedback 
rates. 

Index Terms — Broadcast channels, feedback, MIMO 
systems, throughput. 



I. Introduction 

It is well known that multiple antennas can improve 
the spectral efficiency. This paper considers broadcast 
channels with L antennas at the base station and m 
single-antenna users. To achieve the full benefit, perfect 
channel state information (CSI) is required at both re- 
ceiver and transmitter. Perfect CSI at the receiver can be 
obtained by estimation from the received signal. How- 
ever, if CSI at the transmitter (CSIT) is obtained from 
feedback, perfect CSIT requires an infinite feedback rate. 
As this is not feasible in practice, it is important to 
analyze the effect of finite rate feedback and design 
efficient strategies accordingly. 

The feedback model for broadcast channels is described 
as follows. To save feedback rate on power 
control, we assume a power on/off strategy$^1$ where each 
user is either turned on with a constant power or turned 
off. For a given channel realization, the users quantize 
their channel states into a finite number of bits and feed the 
corresponding indices back to the base station. After receiving 
the feedback from users, the base station decides which 
users should be turned on and then forms beamforming 
vectors for transmission. 

Broadcast channels with feedback have been widely 
studied recently. Ideally, if the base station has the per- 
fect CSI, dirty paper codes or zero-forcing transmission 
can help clean off interference among users. However, 
with only finite rate feedback on CSI, the base station 
does not know the perfect channel state information 
and therefore interference from other users is inevitable. 
The interference gets so strong in the high signal-to-noise 
ratio (SNR) region that the system throughput is upper 
bounded by a constant even when SNR approaches infinity. 
This phenomenon, called interference domination, was 
reported in [1], [2]. One way to combat it is to allow 
the number of users to be much larger than the number 
of antennas at the base station. With sufficiently many 
independent realizations of the channel, it is possible 
to obtain L orthogonal users with feedback: Sharif and 
Hassibi select users whose channel directions are close 
to randomly generated basis vectors in [1]; Yoo et al. 
pick near-orthogonal users in an iterative way 
[3], [4]. Recently, Bayesteh and Khandani quantified the 
feedback required as a function of the number of users 
[5], [6]. Another approach is to fix both the number of 
antennas at the base station and the system size (the 
number of users). It has been shown in [7] that the 
maximum achievable multiplexing gain is one (at high 
SNR) with finite rate feedback. The full multiplexing 
gain requires that the feedback rate increase linearly with 
SNR [2]. In both approaches, a homogeneous system is 
assumed where all the users share the same path loss 
coefficient and feedback resource. 

Separate from the above, this paper studies a more 
realistic scenario: 

1 This assumption will be further validated in Section II. 

• We consider heterogeneous systems where different 
users may have different path loss coefficients and 
feedback rates. 

• The size of the broadcast system is small. That is, 
the number of users and the number of antennas 
at the base station are typically of the same order. 
Note that a cooperative communication network can 
often be viewed as a composition of multi-access 
and broadcast sub-systems with a small number of 
users. Research on broadcast systems of small size 
also provides insights into cooperative communica- 
tions. 

• Analysis and design are valid for arbitrary SNR. 

To the best of the authors' knowledge, the above practically 
important scenario has not been systematically 
studied due to the associated difficulty in analysis. 

For such systems, we solve the interference domination 
problem by choosing the appropriate number of on-users 
s. This solution comes from an asymptotic analysis 
where L, m, s and the feedback rates approach infinity 
linearly. As has been demonstrated in [8] and will 
be verified in our simulations in Fig. 1, this type of 
asymptotic analysis is surprisingly reliable when 
applied to small systems. The main asymptotic results 
include: 

• It is asymptotically optimal to only quantize the 
channel directions and ignore the channel magni- 
tude information. The asymptotically optimal feed- 
back function and codebook are derived accord- 
ingly. 

• A realistic on/off criterion is proposed to decide 
which users should be turned on. 

• The corresponding throughput per antenna converges 
to a constant, defined as the spatial efficiency. 
It is a function of the normalized number 
of on-users $\bar{s} = s/L$. Further, there exists a unique 
$\bar{s} \in (0, 1)$ that maximizes the spatial efficiency. 

Based on the insights obtained from the above asymp- 
totic results, we develop a scheme to choose the appro- 
priate s for systems with finite L and m. Simulations 
show that the gain achieved by choosing s is significant 
compared with the strategies where s = L [2]. In 
addition, our scheme has the following advantages. 

• It is valid for heterogeneous systems. 

• The associated computation complexity is low. In 
the proposed scheme, the choice of on-users is in- 
dependent of the channel realization, and therefore 
there is no need to select on-users every fading 
block. The computation complexity is much smaller 
than that of user selection [1], [5], [6]. 

• Only on-users need to feed back CSI, which saves 
the precious feedback resource. 



II. System Model 

Consider a broadcast channel with L antennas at the 
base station and m single-antenna users. Assume that the 
base station employs a zero-forcing transmitter. Let $\gamma_i \ge 0$ 
($1 \le i \le m$) be the path loss coefficient for user i. Then 
the received signal $Y_i \in \mathbb{C}$ for user i is given by 

\[ Y_i = \sqrt{\gamma_i}\, \mathbf{h}_i^\dagger \sum_{j=1}^{m} \mathbf{q}_j X_j + W_i, \]

where $\mathbf{h}_i \in \mathbb{C}^{L \times 1}$ is the channel state vector for user i, 
$\mathbf{q}_j \in \mathbb{C}^{L \times 1}$ is the zero-forcing beamforming vector for 
user j, $X_j \in \mathbb{C}$ is the source signal for user j, and 
$W_i \in \mathbb{C}$ is circularly symmetric complex Gaussian 
noise with zero mean and unit variance, $\mathcal{CN}(0, 1)$. Here, 
we assume that $\mathbf{q}_j^\dagger \mathbf{q}_j = 1$ and the Rayleigh block fading 
channel model: the entries of $\mathbf{h}_i$ are independent and 
identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$. Without loss of 
generality, we assume that $L \le m$; if $L > m$, adding 
$L - m$ users with $\gamma_i = 0$ yields an equivalent system 
with $m' = L$. 

For the above broadcast system, it is natural to assume 
a total power constraint 

\[ \sum_{j=1}^{m} \mathrm{E}\left[ |X_j|^2 \right] \le \rho. \]



Further, we assume a power on/off strategy with a 
constant number of on-users as follows. 

A1) Power on/off strategy: a source $X_i$ is either 
turned on with a constant power $P_{\mathrm{on}}$ or turned 
off. 

A2) A constant number of on-users: we assume 
that the number of on-users s ($1 \le s \le m$) 
is a constant independent of the specific 
channel realizations, and thus $P_{\mathrm{on}} = \rho / s$. Here, 
s is allowed to be a function of SNR, which 
distinguishes this paper from [1], [2] where 
s = L always. 
A similar strategy has been shown to be near-optimal 
for single-user MIMO systems in our work [8]. Although 
little is known about the optimality of the proposed 
strategy in broadcast systems, we adopt it for two reasons: 
first, this strategy is simple to implement and 
similar forms are employed in many practical systems 
(see IEEE 802.20 and IEEE 802.22 for example); second, 
it saves precious feedback resources on power control. 

The finite rate feedback model is then described as 
follows. Assume that both the base station and user i know 
$\gamma_i$,$^2$ but only user i knows the channel state realization 
$\mathbf{h}_i$ perfectly. For given channel realizations $\mathbf{h}_1, \cdots, \mathbf{h}_m$, 
an on-user i quantizes his channel into $R_i$ bits and 
then feeds the corresponding index to the base station. 
Formally, let $\mathcal{B}_i = \{\hat{\mathbf{h}} \in \mathbb{C}^{L \times 1}\}$ with $|\mathcal{B}_i| = 2^{R_i}$ be a 
channel state codebook for user i. Then the quantization 
function is given by $\mathfrak{q}(\mathbf{h}_i, \mathcal{B}_i) = \hat{\mathbf{h}}_i$. In Sections III-A and 
III-B we will show how to design $\mathfrak{q}$ and $\mathcal{B}$ respectively. 

2 There are many ways in which the base station obtains $\gamma_i$. A simple 
example could be that the base station measures the feedback signal 
strength. 

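After receiving feedback information from users, the 
base station decides which s users should be turned on 
and forms zero-forcing beamforming vectors for them. 
Let $A_{\mathrm{on}}$ be the set of the s on-users. The zero-forcing 
beamforming vectors $\mathbf{q}_i$, $i \in A_{\mathrm{on}}$, are calculated as 
follows.$^3$ Let $\mathcal{P}_i$ be the plane generated by 

\[ \left\{ \hat{\mathbf{h}}_j : j \in A_{\mathrm{on}} \setminus \{i\} \right\}. \]

Let $\mathcal{P}_i^\perp$ be the orthogonal complement of $\mathcal{P}_i$ and t be the 
dimension of $\mathcal{P}_i^\perp$. Let $\mathbf{T}_i \in \mathbb{C}^{L \times t}$ be a random matrix 
whose columns are orthonormal and span the plane $\mathcal{P}_i^\perp$. 
Then $\mathbf{q}_i$ is the unitary projection of $\hat{\mathbf{h}}_i$ on $\mathbf{T}_i$, that is, 

\[ \mathbf{q}_i := \mathbf{T}_i \mathbf{T}_i^\dagger \hat{\mathbf{h}}_i / \left\| \mathbf{T}_i \mathbf{T}_i^\dagger \hat{\mathbf{h}}_i \right\|. \]

Here, if s = 1 and $A_{\mathrm{on}} = \{i\}$, $\mathbf{T}_i$ is an $L \times L$ unitary 
matrix and $\mathbf{q}_i = \hat{\mathbf{h}}_i / \|\hat{\mathbf{h}}_i\|$. 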
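
The projection construction above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the quantized channels are linearly independent with s at most L, and it realizes the projection onto the orthogonal complement by Gram-Schmidt instead of forming a matrix T_i explicitly. The helper names (`crandn`, `inner`, `normalize`, `zf_beamformers`) are hypothetical.

```python
import math
import random

def crandn(rng):
    # One CN(0,1) sample: real and imaginary parts each have variance 1/2.
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def inner(a, b):
    # Hermitian inner product a^dagger b.
    return sum(x.conjugate() * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(inner(a, a).real)
    return [x / n for x in a]

def remove_components(vec, basis):
    # Subtract from vec its components along the orthonormal vectors in basis.
    r = list(vec)
    for b in basis:
        c = inner(b, r)
        r = [x - c * y for x, y in zip(r, b)]
    return r

def zf_beamformers(h_hat):
    """For each on-user i, project h_hat[i] onto the orthogonal complement of
    span{h_hat[j] : j != i} and normalize the result to unit norm."""
    beams = []
    for i, hi in enumerate(h_hat):
        basis = []  # orthonormal basis of the plane spanned by the other users
        for j, hj in enumerate(h_hat):
            if j != i:
                basis.append(normalize(remove_components(hj, basis)))
        beams.append(normalize(remove_components(hi, basis)))
    return beams
```

Each returned beam has unit norm and is orthogonal to the quantized channels of all other on-users, which is the zero-forcing property used in the analysis.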



III. Asymptotic Analysis 

As m and L are of the same order, we consider the 
asymptotic regime where L, m and the $R_i$'s $\to \infty$ linearly. 

A. Design of Quantization Function 

Generally speaking, full information of $\mathbf{h}_i$ contains the 
direction information $\mathbf{v}_i := \mathbf{h}_i / \|\mathbf{h}_i\|$ and the magnitude 
information $\|\mathbf{h}_i\|$. In our Rayleigh fading channel model, 
it is well known that $\mathbf{v}_i$ and $\|\mathbf{h}_i\|$ are independent. 
Intuitively, joint quantization of $\mathbf{v}_i$ and $\|\mathbf{h}_i\|$ is preferred. 

Interestingly, Proposition 1 implies that there is no 
need to quantize the channel magnitudes. Indeed, as 
$L, m \to \infty$ linearly, all users' channel magnitudes 
concentrate on a single value in probability. 



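Proposition 1: For all $\epsilon > 0$, as $L, m \to \infty$ with $m/L \to \bar{m} \in \mathbb{R}^+$, 

\[ \lim \Pr\left( \max_{1 \le i \le m} \frac{1}{L} \|\mathbf{h}_i\|^2 > 1 + \epsilon \right) = 0 \]

and 

\[ \lim \Pr\left( \min_{1 \le i \le m} \frac{1}{L} \|\mathbf{h}_i\|^2 < 1 - \epsilon \right) = 0. \]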



3 Our interpretation of constructing zero-forcing beamforming vectors 
is different from the traditional one (see [4] for example). We 
adopt the unitary projection because it not only has an explicit 
geometric meaning but also provides a nice "isotropic" property, 
which is crucial in the proofs (see Appendices C and D for details). 

The proof is given in Appendix A. It is noteworthy 
that whether the users' channel magnitudes concentrate 
or not depends on the relationship between L and m: 
the concentration happens when L and m are of the 
same order. To fully understand Proposition 1, it is 
important to realize that the Law of Large Numbers 
does not imply that all users' channel magnitudes will 
concentrate uniformly. The Law of Large Numbers says 
that $\frac{1}{L}\|\mathbf{h}_i\|^2 \to 1$ almost surely for any given i. However, 
if m approaches infinity exponentially with L, there 
are a certain number of users whose channel magnitudes 
are larger than others', and therefore it may still be 
beneficial to quantize and feed back channel magnitude 
information. Formally, consider a broadcast channel with 
$\gamma_1 = \cdots = \gamma_m = 1$. As $L, m \to \infty$ with $\log(m)/L \to 
\bar{m}' \in \mathbb{R}^+$, there exist an $\epsilon > 0$, $\delta_1 > 0$ and $\delta_2 > 0$ such 
that 



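\[ \frac{1}{L} \log \left| \left\{ i : \frac{1}{L} \|\mathbf{h}_i\|^2 > 1 + \epsilon \right\} \right| \to \delta_1 \]

and 

\[ \frac{1}{L} \log \left| \left\{ i : \frac{1}{L} \|\mathbf{h}_i\|^2 < 1 - \epsilon \right\} \right| \to \delta_2, \]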



in probability. The proof follows from the standard large 
deviation technique and is omitted here. 

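Proposition 1 implies that it is sufficient to quantize 
the channel direction information only and omit the 
channel magnitude information. For this quantization, 
the codebook is given by $\mathcal{B}_i = \{\mathbf{p} \in \mathbb{C}^{L \times 1} : \|\mathbf{p}\| = 1\}$ 
with $|\mathcal{B}_i| = 2^{R_i}$. Let $\mathbf{v}_i = \mathbf{h}_i / \|\mathbf{h}_i\|$. The quantization 
output is given by 

\[ \mathbf{p}_i = \mathfrak{q}(\mathbf{h}_i, \mathcal{B}_i) = \arg\max_{\mathbf{p} \in \mathcal{B}_i} \left| \mathbf{v}_i^\dagger \mathbf{p} \right|. \quad (1) \]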
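
As an illustration of the quantization rule (1), the sketch below draws a codebook of 2^R unit-norm vectors and returns the codeword with the largest correlation magnitude with the channel direction. It is a sketch under stated assumptions: codewords are obtained by normalizing i.i.d. complex Gaussian vectors (which yields isotropic directions), and the function names are hypothetical.

```python
import math
import random

def crandn(rng):
    # One CN(0,1) sample.
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def unit(vec):
    n = math.sqrt(sum(abs(x) ** 2 for x in vec))
    return [x / n for x in vec]

def correlation(v, p):
    # |v^dagger p| for unit-norm v and p.
    return abs(sum(a.conjugate() * b for a, b in zip(v, p)))

def random_codebook(L, R, rng):
    # 2^R isotropic unit vectors in C^L (normalized complex Gaussian vectors).
    return [unit([crandn(rng) for _ in range(L)]) for _ in range(2 ** R)]

def quantize(h, codebook):
    """Return (index, codeword) maximizing |v^dagger p|, where v = h / ||h||.
    Only the index needs to be fed back to the base station."""
    v = unit(h)
    idx = max(range(len(codebook)), key=lambda k: correlation(v, codebook[k]))
    return idx, codebook[idx]
```

Because only the direction enters the rule, scaling the channel vector by a positive constant leaves the fed-back index unchanged.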



B. Asymptotically Optimal Codebooks 

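Consider the design of codebooks. Given the quantization 
function (1), the distortion of a given codebook $\mathcal{B}_i$ is the 
average squared chordal distance between the actual and 
quantized channel directions corresponding to the codebook 
$\mathcal{B}_i$, defined as 

\[ D(\mathcal{B}_i) := 1 - \mathrm{E}_{\mathbf{h}_i}\left[ \max_{\mathbf{p} \in \mathcal{B}_i} \left| \mathbf{v}_i^\dagger \mathbf{p} \right|^2 \right]. \]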



The following lemma bounds the minimum achievable 
distortion for a given codebook rate. 

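Lemma 1: Define $D^*(R) = \inf_{\mathcal{B} : |\mathcal{B}| \le 2^R} D(\mathcal{B})$. Then 

\[ \frac{L-1}{L}\, 2^{-\frac{R}{L-1}} (1 + o(1)) \le D^*(R) \le \frac{\Gamma\left(\frac{1}{L-1}\right)}{L-1}\, 2^{-\frac{R}{L-1}} (1 + o(1)), \quad (2) \]

and as L and R approach infinity with $R/L \to \bar{r} \in \mathbb{R}^+$, 

\[ \lim D^*(R) = 2^{-\bar{r}}. \]
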
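The following lemma shows that a random codebook 
is asymptotically optimal in probability. 

Lemma 2: Let $\mathcal{B}_{\mathrm{rand}}$ be a random codebook where 
the vectors $\mathbf{p} \in \mathcal{B}_{\mathrm{rand}}$ are independently generated 
from the isotropic distribution. Let $R = \log |\mathcal{B}_{\mathrm{rand}}|$. As 
$L, R \to \infty$ with $R/L \to \bar{r} \in \mathbb{R}^+$, for all $\epsilon > 0$, 

\[ \lim_{(L,R) \to \infty} \Pr\left\{ \mathcal{B}_{\mathrm{rand}} : D(\mathcal{B}_{\mathrm{rand}}) < 2^{-\bar{r}} - \epsilon \right\} = 0 \]

and 

\[ \lim_{(L,R) \to \infty} \Pr\left\{ \mathcal{B}_{\mathrm{rand}} : D(\mathcal{B}_{\mathrm{rand}}) > 2^{-\bar{r}} + \epsilon \right\} = 0. \]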

The proofs of Lemmas 1 and 2 are given in our paper 
[9]. Due to the asymptotic optimality of random codebooks, 
we assume that the codebooks $\mathcal{B}_i$, $i = 1, \cdots, m$, 
are independent and randomly constructed throughout 
this paper. 
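
The behavior of random codebooks can be checked numerically for small L. The sketch below is an illustration rather than part of the original analysis. It relies on one standard fact that is an assumption here: for a fixed unit vector v and an isotropic unit vector p in C^L, |v†p|^2 is Beta(1, L-1) distributed, which gives the average distortion of a random codebook with N = 2^R codewords in closed form as (1/t) B(1/t, N+1) with t = L-1. The Monte Carlo estimate and the main-order terms of the bounds in (2) are computed for comparison; all function names are hypothetical.

```python
import math
import random

def exact_random_codebook_distortion(L, R):
    # (1/t) * Beta(1/t, N+1), t = L-1, N = 2^R (assumes the Beta(1, L-1) law).
    t, n = L - 1, 2 ** R
    log_beta = math.lgamma(1.0 / t) + math.lgamma(n + 1) - math.lgamma(n + 1 + 1.0 / t)
    return math.exp(log_beta) / t

def lower_main_term(L, R):
    # Main-order term of the lower bound in (2).
    return (L - 1) / L * 2.0 ** (-R / (L - 1))

def upper_main_term(L, R):
    # Main-order term of the upper bound in (2).
    return math.gamma(1.0 / (L - 1)) / (L - 1) * 2.0 ** (-R / (L - 1))

def _unit_gaussian(L, rng):
    s = math.sqrt(0.5)
    g = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(L)]
    n = math.sqrt(sum(abs(x) ** 2 for x in g))
    return [x / n for x in g]

def monte_carlo_distortion(L, R, trials, rng):
    # Average of 1 - max_p |v^dagger p|^2 over fresh directions and codebooks.
    total = 0.0
    for _ in range(trials):
        v = _unit_gaussian(L, rng)
        best = 0.0
        for _ in range(2 ** R):
            p = _unit_gaussian(L, rng)
            c = abs(sum(a.conjugate() * b for a, b in zip(v, p))) ** 2
            best = max(best, c)
        total += 1.0 - best
    return total / trials
```

For L = 4 and R = 6 the closed form evaluates to roughly 0.222, which lies between the two main-order terms 0.1875 and about 0.2232.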

C. On/off Criterion 

After receiving feedback from users, the base station 
should decide which s users should be turned on. 

Ideally, for given channel realizations $\mathbf{h}_1, \cdots, \mathbf{h}_m$, 
the optimal set of on-users $A_{\mathrm{on}}^*$ should be chosen to 
maximize the instantaneous mutual information. However, 
finding $A_{\mathrm{on}}^*$ requires exhaustive search, whose 
complexity increases exponentially with m. 

A suboptimal option is the random orthonormal beams 
construction method in [1]: the base station randomly 
constructs L orthonormal beams $\mathbf{b}_1, \cdots, \mathbf{b}_L$, finds the 
users with the highest signal-to-interference-plus-noise 
ratios (SINRs) through feedback from users, and then 
transmits to these selected users. Note that the maximum 
SINR achievable for user i is determined by 
$\max_{1 \le k \le L} \frac{1}{L} |\mathbf{h}_i^\dagger \mathbf{b}_k|^2$. However, 
Proposition 2 below shows that in our asymptotic regime 
where $L, m \to \infty$ linearly, all users' channels are nearly 
orthogonal to all of the L orthonormal beams $\mathbf{b}_k$. 
Therefore, all users' maximum SINRs approach zero 
uniformly in probability, and no user should be turned on 
in probability. The method in [1] fails in our asymptotic 
regime. 

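Proposition 2: Given any $\epsilon > 0$ and any L orthonormal 
beams $\mathbf{b}_k \in \mathbb{C}^{L \times 1}$, $1 \le k \le L$, as $L, m \to \infty$ linearly 
with $m/L \to \bar{m} \in \mathbb{R}^+$, 

\[ \lim_{(L,m) \to \infty} \Pr\left( \max_{1 \le i \le m,\ 1 \le k \le L} \frac{1}{L} \left| \mathbf{h}_i^\dagger \mathbf{b}_k \right|^2 > \epsilon \right) = 0. \]

Proof: See Appendix B. ■ 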
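In this paper, we take another approach where the 
on/off decision is independent of channel directions. We 
start with the throughput analysis for a specific on-user 
$i \in A_{\mathrm{on}}$. Note that 

\[ Y_i = \sqrt{\gamma_i}\, \mathbf{h}_i^\dagger \mathbf{q}_i X_i + \left( \sqrt{\gamma_i}\, \mathbf{h}_i^\dagger \sum_{j \in A_{\mathrm{on}} \setminus \{i\}} \mathbf{q}_j X_j + W_i \right). \]

The signal power and interference power for user i are 
given by 

\[ P_{\mathrm{sig},i} := \frac{\rho}{s} \gamma_i \left| \mathbf{h}_i^\dagger \mathbf{q}_i \right|^2 \quad (3) \]

and 

\[ P_{\mathrm{int},i} := \frac{\rho}{s} \gamma_i \sum_{j \in A_{\mathrm{on}} \setminus \{i\}} \left| \mathbf{h}_i^\dagger \mathbf{q}_j \right|^2 \quad (4) \]

respectively. If the choice of $A_{\mathrm{on}}$ is independent of 
the channel directions $\mathbf{v}_i$, we have a nice property 
regarding $P_{\mathrm{sig},i}$ and $P_{\mathrm{int},i}$. 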

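Theorem 1: Let $A_{\mathrm{on}}$ with $|A_{\mathrm{on}}| = s$ be chosen independently 
of the $\mathbf{v}_i$'s. Let L, m, s and the $R_i$'s $\to \infty$ with $m/L \to \bar{m} \in \mathbb{R}^+$, 
$s/L \to \bar{s} \in [0, 1]$ and $R_i/L \to \bar{r}_i \in \mathbb{R}^+$. Assume that the $\mathbf{v}_i$'s, 
$i \in A_{\mathrm{on}}$, are independent. Then for all $i \in A_{\mathrm{on}}$, 

\[ P_{\mathrm{sig},i} \to \frac{\rho}{\bar{s}} \gamma_i \left(1 - 2^{-\bar{r}_i}\right) (1 - \bar{s}), \]

\[ P_{\mathrm{int},i} \to \rho \gamma_i 2^{-\bar{r}_i}, \]

and therefore the throughput of user i satisfies 

\[ \mathcal{I}_i := \log\left(1 + \frac{P_{\mathrm{sig},i}}{1 + P_{\mathrm{int},i}}\right) \to \log\left(1 + \frac{1 - \bar{s}}{\bar{s}} \eta_i\right) \quad (5) \]

in probability, where 

\[ \eta_i := \frac{\rho \gamma_i \left(1 - 2^{-\bar{r}_i}\right)}{1 + \rho \gamma_i 2^{-\bar{r}_i}}. \]

Proof: See Appendices C and D. ■ 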


Theorem 1 shows that if $A_{\mathrm{on}}$ is independent of the $\mathbf{v}_i$'s, 
$\mathcal{I}_i$ is a function of $\eta_i$ but independent of the specific 
channel realization in probability. Based on this fact, 
we select the set of s on-users $A_{\mathrm{on}}$ such that $|A_{\mathrm{on}}| = s$ 
and 

\[ A_{\mathrm{on}} = \left\{ i : \eta_i \ge \eta_j \text{ for all } j \notin A_{\mathrm{on}} \right\}; \quad (6) \]

if there are multiple candidates, we randomly choose 
one of them. It is the asymptotically optimal on/off 
selection if the on/off decision is independent of the 
channel direction information. The difference between 
the throughput achieved by the optimal on/off criterion 
(requiring exhaustive search) and the proposed (6) remains 
unknown. 
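
The criterion (6) amounts to ranking users by η_i = ρ γ_i (1 - 2^{-r̄_i}) / (1 + ρ γ_i 2^{-r̄_i}) and keeping the s largest. The following is a minimal sketch under two assumptions that are not in the original text: logarithms are taken to base 2 (throughput in bits), and ties are broken by user index rather than at random. Function names are hypothetical.

```python
import math

def eta(rho, gamma, r_bar):
    # eta_i from Theorem 1, with r_bar = R_i / L (normalized feedback rate).
    d = 2.0 ** (-r_bar)
    return rho * gamma * (1.0 - d) / (1.0 + rho * gamma * d)

def select_on_users(rho, gammas, r_bars, s):
    """Return the indices of the s users with the largest eta_i, plus all eta_i.
    Ties are resolved by index here; the text resolves them randomly."""
    etas = [eta(rho, g, r) for g, r in zip(gammas, r_bars)]
    order = sorted(range(len(etas)), key=lambda i: etas[i], reverse=True)
    return sorted(order[:s]), etas

def asymptotic_user_throughput(eta_i, s_bar):
    # Limit in (5): log2(1 + (1 - s_bar) / s_bar * eta_i), for 0 < s_bar < 1.
    return math.log2(1.0 + (1.0 - s_bar) / s_bar * eta_i)
```

For example, with equal feedback rates the selected users are simply those with the largest path loss coefficients, since η_i is increasing in γ_i.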

D. The Spatial Efficiency 

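We define the spatial efficiency (bits/sec/Hz/antenna) 
as 

\[ \bar{\mathcal{I}}(\bar{s}) := \lim \bar{\mathcal{I}}^{(L)}, \]

where L, m, s and the $R_i$'s $\to \infty$ in the same way as before, 
$\bar{\mathcal{I}}^{(L)}$ is the average throughput per antenna given by 

\[ \bar{\mathcal{I}}^{(L)} := \mathrm{E}_{\mathcal{B}_i\text{'s},\, \mathbf{h}_i\text{'s}}\left[ \frac{1}{L} \sum_{i \in A_{\mathrm{on}}} \log\left(1 + \frac{P_{\mathrm{sig},i}}{1 + P_{\mathrm{int},i}}\right) \right], \]

and $A_{\mathrm{on}}$, $P_{\mathrm{sig},i}$ and $P_{\mathrm{int},i}$ are defined in (6), (3) and (4) 
respectively. 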

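We shall quantify $\bar{\mathcal{I}}(\bar{s})$ for a given $\bar{s}$. Define the 
empirical distribution of $\eta_i$ as 

\[ \mu_\eta^{(m)}(\eta \le x) := \frac{1}{m} \left| \{ \eta_i : \eta_i \le x \} \right|, \]

and assume that $\mu_\eta := \lim \mu_\eta^{(m)}$ exists weakly as 
L, m and the $R_i$'s $\to \infty$. In order to cope with $\mu_\eta$'s with mass 
points, define 

\[ \int_{x^+}^{\infty} f(\eta)\, d\mu_\eta := \lim_{\Delta x \downarrow 0} \int_{x + \Delta x}^{\infty} f(\eta)\, d\mu_\eta \]

for all $x \in \mathbb{R}$, where f is an integrable function with respect 
to $\mu_\eta$. Then $\bar{\mathcal{I}}(\bar{s})$ is computed in the following theorem. 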

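Theorem 2: Let L, m, s and the $R_i$'s $\to \infty$ with $m/L \to \bar{m}$, 
$s/L \to \bar{s}$ and $R_i/L \to \bar{r}_i$. Define 

\[ \eta_{\bar{s}} := \sup\left\{ \eta : \bar{m} \int_{\eta}^{\infty} d\mu_\eta \ge \bar{s} \right\}. \]

Then if $\bar{s} \notin (0, 1)$, $\bar{\mathcal{I}}(\bar{s}) = 0$. If $\bar{s} \in (0, 1)$, 

\[ \bar{\mathcal{I}}(\bar{s}) = \bar{m} \int_{\eta_{\bar{s}}^+}^{\infty} \log\left(1 + \frac{1 - \bar{s}}{\bar{s}} \eta\right) d\mu_\eta + \left( \bar{s} - \bar{m} \int_{\eta_{\bar{s}}^+}^{\infty} d\mu_\eta \right) \log\left(1 + \frac{1 - \bar{s}}{\bar{s}} \eta_{\bar{s}}\right). \quad (7) \]

Proof: It follows from Theorem 1. ■ 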
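We are also interested in finding the optimal $\bar{s}$ to 
maximize $\bar{\mathcal{I}}(\bar{s})$. Though $\bar{\mathcal{I}}(\bar{s})$ is not a concave function 
in general, the following theorem provides a criterion to 
find the optimal $\bar{s}$. 

Theorem 3: $\bar{\mathcal{I}}(\bar{s})$ is maximized at a unique $\bar{s}^* \in (0, 1)$ 
such that 

\[ 0 \in \left[ \liminf_{\Delta s \to 0} \frac{\bar{\mathcal{I}}(\bar{s}^*) - \bar{\mathcal{I}}(\bar{s}^* - \Delta s)}{\Delta s},\ \limsup_{\Delta s \to 0} \frac{\bar{\mathcal{I}}(\bar{s}^*) - \bar{\mathcal{I}}(\bar{s}^* - \Delta s)}{\Delta s} \right]. \quad (8) \]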

The proof is in Appendix E. The corresponding $\bar{\mathcal{I}}(\bar{s}^*)$ 
is the maximum achievable spatial efficiency for the 
proposed power on/off strategy. It is noteworthy that $\bar{s}^*$ 
is not a monotone function of SNR $\rho$ according to our 
empirical calculation. 
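
For a homogeneous system (all users share the same η and there are at least as many users as on-users), Theorem 1 gives a per-antenna throughput of s̄ log(1 + ((1 - s̄)/s̄) η). The sketch below locates its maximizer by a simple grid search. It is only an illustration under these assumptions: base-2 logarithms are assumed, the grid resolution is arbitrary, and the function names are hypothetical.

```python
import math

def eta(rho, gamma, r_bar):
    # eta from Theorem 1 for path loss gamma and normalized feedback rate r_bar.
    d = 2.0 ** (-r_bar)
    return rho * gamma * (1.0 - d) / (1.0 + rho * gamma * d)

def spatial_efficiency(s_bar, eta_value):
    # Homogeneous case: s_bar * log2(1 + (1 - s_bar) / s_bar * eta); zero outside (0, 1).
    if not 0.0 < s_bar < 1.0:
        return 0.0
    return s_bar * math.log2(1.0 + (1.0 - s_bar) / s_bar * eta_value)

def best_s_bar(eta_value, grid=999):
    # Grid search over s_bar in {1/(grid+1), ..., grid/(grid+1)}.
    candidates = [k / (grid + 1) for k in range(1, grid + 1)]
    return max(candidates, key=lambda s: spatial_efficiency(s, eta_value))
```

Evaluating `best_s_bar(eta(rho, 1.0, r_bar))` over a range of `rho` gives a quick view of how the preferred fraction of on-users moves with SNR for a fixed feedback rate.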



IV. Finite Dimensional System Design 

Based on the above asymptotic results, we now pro- 
pose a scheme for systems with finite L and m. 



A. Throughput Estimation for Finite Dimensional Sys- 
tems 

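While asymptotic analysis provides many insights, we 
do not apply asymptotic results directly to a finite 
dimensional system. The reason is that in asymptotic 
analysis $\frac{1}{L} \to 0$ while $\frac{1}{L} > 0$ for finite dimensional 
systems. To see the difference more explicitly, let us 
calculate the main order term of the throughput for user 
$i \in A_{\mathrm{on}}$. For user $i \in A_{\mathrm{on}}$, the corresponding throughput 
is 

\[ \mathcal{I}_i = \mathrm{E}\left[ \log\left(1 + \frac{P_{\mathrm{sig},i}}{1 + P_{\mathrm{int},i}}\right) \right] \]
\[ = \log\left(1 + \frac{\mathrm{E}[P_{\mathrm{sig},i}]}{1 + \mathrm{E}[P_{\mathrm{int},i}]}\right) + \mathrm{E}\left[ \log\left(1 + \frac{P_{\mathrm{sig},i} - \mathrm{E}[P_{\mathrm{sig},i}] + P_{\mathrm{int},i} - \mathrm{E}[P_{\mathrm{int},i}]}{1 + \mathrm{E}[P_{\mathrm{sig},i}] + \mathrm{E}[P_{\mathrm{int},i}]}\right) \right] - \mathrm{E}\left[ \log\left(1 + \frac{P_{\mathrm{int},i} - \mathrm{E}[P_{\mathrm{int},i}]}{1 + \mathrm{E}[P_{\mathrm{int},i}]}\right) \right], \]

where $P_{\mathrm{sig},i}$ and $P_{\mathrm{int},i}$ are defined in (3) and (4). We 
quantify $\mathrm{E}[P_{\mathrm{sig},i}]$ and $\mathrm{E}[P_{\mathrm{int},i}]$ below. 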

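Theorem 4: Let the $\mathcal{B}_i$'s be randomly constructed and 
$D_i = \mathrm{E}_{\mathcal{B}_i}[D(\mathcal{B}_i)]$ for all $1 \le i \le m$. For randomly 
chosen $A_{\mathrm{on}}$ and $i \in A_{\mathrm{on}}$, if $1 \le s \le L$, 

\[ \mathrm{E}[P_{\mathrm{sig},i}] = \rho \gamma_i \frac{L}{s} \left[ (1 - D_i)\left(1 - \frac{s-1}{L}\right) + D_i \frac{s-1}{L(L-1)} \right] \quad (9) \]

and 

\[ \mathrm{E}[P_{\mathrm{int},i}] = \rho \gamma_i \frac{L}{s} \frac{s-1}{L-1} D_i; \quad (10) \]

if $s > L$, $\mathrm{E}[P_{\mathrm{sig},i}] = 0$. 

The proof is provided in Appendices C and D. Define 

\[ \mathcal{I}_{\mathrm{main},i} := \log\left(1 + \frac{\mathrm{E}[P_{\mathrm{sig},i}]}{1 + \mathrm{E}[P_{\mathrm{int},i}]}\right). \quad (11) \]
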

It can be verified from Theorem 1 that $\mathcal{I}_i = \mathcal{I}_{\mathrm{main},i} + 
o(1)$ and therefore $\mathcal{I}_{\mathrm{main},i}$ is the main order term of 
$\mathcal{I}_i$. Then the difference between asymptotic analysis and 
finite dimensional systems analysis is clear. In the limit, 
$s/L \to \bar{s}$ and $R_i/L \to \bar{r}_i$. However, for finite dimensional 
systems, simply substituting the asymptotic values into 
(9)-(11) directly introduces a noticeable error, especially 
when L is small. Therefore, to estimate $\mathcal{I}_i$ (for all $i \in A_{\mathrm{on}}$) for 
finite dimensional systems, we have to rely on (9)-(11). 

The calculation of $\mathrm{E}[P_{\mathrm{sig},i}]$ and $\mathrm{E}[P_{\mathrm{int},i}]$ relies on 
quantification of $D_i$. In general, it is difficult to compute 
$D_i$ precisely. Note that the upper bound in (2) is derived 
by evaluating the average performance of random codebooks 
(see [9] for details). We use its main order term 
to estimate $D_i$: 

\[ D_i \approx \frac{\Gamma\left(\frac{1}{L-1}\right)}{L-1}\, 2^{-\frac{R_i}{L-1}}. \]




B. A Scheme for Finite Dimensional Systems 

Given system parameters, a practical scheme for finding 
the appropriate s and $A_{\mathrm{on}}$ is developed. 

For a given s, the set $A_{\mathrm{on}}$ is decided as follows: 
calculate $\mathcal{I}_{\mathrm{main},1}, \cdots, \mathcal{I}_{\mathrm{main},m}$ according to (11) and 
choose the s users with the largest $\mathcal{I}_{\mathrm{main},i}$'s to turn on; if 
there exists an ambiguity, random selection is employed 
to resolve it. For example, if $\mathcal{I}_{\mathrm{main},1} = \mathcal{I}_{\mathrm{main},2} = \cdots = 
\mathcal{I}_{\mathrm{main},m}$, the s on-users are randomly drawn from all 
the m users. Note again that $A_{\mathrm{on}}$ is independent of the 
channel realization. 

The appropriate s is chosen as follows. Let 

\[ \mathcal{I}_{\mathrm{main}}(s) := \max_{A_{\mathrm{on}} : |A_{\mathrm{on}}| = s}\ \sum_{i \in A_{\mathrm{on}}} \mathcal{I}_{\mathrm{main},i}. \]

We choose the number of on-users to be 

\[ s^*_{\mathrm{main}} = \arg\max_{1 \le s \le L} \mathcal{I}_{\mathrm{main}}(s). \]
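
The selection of the number of on-users can be sketched in a few lines. The code below evaluates E[P_sig,i] = ρ γ_i (L/s) [(1 - D_i)(1 - (s-1)/L) + D_i (s-1)/(L(L-1))] and E[P_int,i] = ρ γ_i (L/s) ((s-1)/(L-1)) D_i with D_i estimated by Γ(1/(L-1))/(L-1) · 2^{-R_i/(L-1)}, and then searches over s. It is a sketch under assumptions, not a reference implementation: base-2 logarithms are assumed, ties are broken by index instead of randomly, the distortion estimate is clipped at 1, and all function names are hypothetical.

```python
import math

def distortion_estimate(L, R):
    # Main-order term of the random-codebook upper bound, clipped to at most 1.
    t = L - 1
    return min(1.0, math.gamma(1.0 / t) / t * 2.0 ** (-R / t))

def i_main(rho, gamma, L, s, D):
    # Main-order throughput of one on-user; zero if zero forcing is infeasible.
    if s > L:
        return 0.0
    scale = rho * gamma * L / s
    sig = scale * ((1.0 - D) * (1.0 - (s - 1) / L) + D * (s - 1) / (L * (L - 1)))
    itf = scale * (s - 1) / (L - 1) * D
    return math.log2(1.0 + sig / (1.0 + itf))

def choose_on_users(rho, gammas, rates, L):
    """Return (total main-order throughput, s, sorted on-user indices)."""
    best = None
    for s in range(1, min(L, len(gammas)) + 1):
        vals = [i_main(rho, g, L, s, distortion_estimate(L, R))
                for g, R in zip(gammas, rates)]
        on = sorted(range(len(vals)), key=lambda i: vals[i], reverse=True)[:s]
        total = sum(vals[i] for i in on)
        if best is None or total > best[0]:
            best = (total, s, sorted(on))
    return best
```

With L = m = 4, unit path loss and ρ = 100 (20 dB), this sketch selects s = 1 for 6 feedback bits and s = 3 for 12 feedback bits. The search depends only on L, m, the path loss coefficients, the feedback rates and ρ, so it does not have to be repeated for every fading block.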

Although the above procedure involves exhaustive 
search, the corresponding complexity is actually low. 
First, the calculations are independent of instantaneous 
channel realizations. Only the system parameters L, m, the $\gamma_i$'s, 
the $R_i$'s and $\rho$ are needed. Provided that the $\gamma_i$'s change slowly, 
the base station does not need to recalculate $s^*_{\mathrm{main}}$ and 
$A_{\mathrm{on}}$ frequently. Second, $R_i = R_j$ in most systems. For 
such systems, the s on-users are simply the users 
with the largest $\gamma_i$'s. 

After calculating $s^*_{\mathrm{main}}$ and $A_{\mathrm{on}}$, the base station 
broadcasts $A_{\mathrm{on}}$ to all the users. For each fading block, 
the system works as follows. 1) At the beginning of 
each fading block, the base station broadcasts a single 
channel training sequence to help all the users estimate 
their channel states $\mathbf{h}_i$. 2) After estimating their $\mathbf{h}_i$'s, 
the on-users quantize the $\mathbf{h}_i$'s into $\mathbf{p}_i$'s according to (1) and 
feed the corresponding indices to the base station. 3) The 
base station then calculates the transmit beamforming 
vectors $\mathbf{q}_i$ and transmits the $\mathbf{q}_i X_i$'s. 

Remark 1 (Fairness Scheduling): For systems with 
$\gamma_i \ne \gamma_j$ or $R_i \ne R_j$, there may be some users that are always 
turned off according to the above scheme. Fairness 
scheduling is therefore needed to ensure fairness of the 
system. An example could be as follows. Given m users, 
calculate the corresponding $s^*_{\mathrm{main}}$ and $A_{\mathrm{on}}$, and then 
turn on the users in $A_{\mathrm{on}}$ for the first fading block. At 
the second fading block, only consider the users who 
have not been turned on, $\{1, \cdots, m\} \setminus A_{\mathrm{on}}$. Calculate the 
corresponding $s^*_{\mathrm{main}}$ and $A_{\mathrm{on}}$, and then turn on the users 
in the new $A_{\mathrm{on}}$. Continue this process until all users have 
been turned on once. Then start a new scheduling cycle. 
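
The scheduling cycle in Remark 1 can be sketched as a loop that repeatedly applies a selection rule to the users not yet served. In the sketch below, `choose_on_users` is a hypothetical callback standing in for the computation of the on-user set among the remaining users; it is an assumption of this example and not an interface defined in the paper.

```python
def fairness_cycle(users, choose_on_users):
    """Return one scheduling cycle: a list of on-user sets, one per fading
    block, such that every user is turned on exactly once."""
    remaining = list(users)
    schedule = []
    while remaining:
        on = list(choose_on_users(remaining))
        if not on:
            # Guard against a selection rule that would stall the cycle.
            raise ValueError("selection rule returned no users")
        schedule.append(on)
        remaining = [u for u in remaining if u not in on]
    return schedule
```

A rule that always picks the two strongest remaining users, for instance, serves five users in three fading blocks before a new cycle starts.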

C. Simulation Results 

Fig. 1 gives the simulation results for the proposed 
scheme using zero-forcing. In the simulations, L = m = 
4. For simplicity, we assume that $\gamma_1 = \gamma_2 = \cdots = 
\gamma_m = 1$ and $R_1 = R_2 = \cdots = R_m = R_{\mathrm{fb}}$. With these 
assumptions, the s on-users can be randomly chosen 
from all the m users. Without loss of generality, we 
assume that $A_{\mathrm{on}} = \{1, \cdots, s\}$. Let $\mathcal{I}(s) = 
\mathrm{E}\left[ \sum_{i=1}^{s} \log\left(1 + \frac{P_{\mathrm{sig},i}}{1 + P_{\mathrm{int},i}}\right) \right]$ be the total throughput. 

[Fig. 1. Total Throughput for Zero Forcing Beamforming, L = 4, m = 4, 
plotted against SNR (dB). (a) $R_{\mathrm{fb}}$ = 6 Bits/Channel Realization. 
(b) $R_{\mathrm{fb}}$ = 12 Bits/Channel Realization. Curves: simulation of 
$\mathcal{I}(s^*_{\mathrm{main}})$, theoretical calculation of $\mathcal{I}_{\mathrm{main}}(s^*_{\mathrm{main}})$, and 
simulation of $\mathcal{I}(s)$ with fixed s.] 



In Fig. 1, the solid lines are the simulations of $\mathcal{I}(s^*_{\mathrm{main}})$ 
while the dashed lines are the theoretical calculation 
of $\mathcal{I}_{\mathrm{main}}(s^*_{\mathrm{main}})$. The simulation results show that the 
optimal s is a function of $\rho$ and $R_{\mathrm{fb}}$. For example, 
s = 1 is optimal when $\rho \in [15, 20]$ dB and $R_{\mathrm{fb}} = 6$ 
bits, while s = 3 is optimal for the same SNR region as 
$R_{\mathrm{fb}}$ increases to 12 bits. The reason is that 
the interference introduced by finite rate quantization 
is larger when $R_{\mathrm{fb}}$ is smaller: when $R_{\mathrm{fb}}$ is small, the 
base station needs to turn off some users to avoid strong 
interference as SNR gets very large. 

We also compare our scheme with schemes where 
the number of on-users is a presumed constant 
(independent of $\rho$ and $R_{\mathrm{fb}}$). The throughput of schemes 
with presumed s is presented in dotted lines. In the 
simulation results, the throughput achieved by choosing 
an appropriate s is always at least as good as that with 
presumed s. Specifically, compared to the scheme in 
[2] where s = L = 4 always, our scheme achieves a 
significant gain at high SNR by turning off some users. 

It is interesting to observe that, given the feedback rates, 
the optimal number of on-users $s^*_{\mathrm{main}}$ is not monotonic 
in SNR $\rho$. As discussed before, $s^*_{\mathrm{main}} = 1$ as $\rho \to \infty$ 
to avoid the interference domination phenomenon. As 
$\rho \to 0$, it can be shown that $s^*_{\mathrm{main}} = 1$ as well. In this 
case, compared with the noise power, the interference is 
weak and can be ignored. Setting s = 1 avoids the signal 
power loss (the $-\frac{s-1}{L}$ term in (9)) due to the zero-forcing 
projection and is therefore optimal. For medium SNRs, 
we have to rely on the scheme in Section IV-B. 



V. Conclusion 

This paper considers heterogeneous broadcast systems 
with a relatively small number of users. Asymptotic 
analysis where L, m, s and the $R_i$'s $\to \infty$ linearly is employed 
to gain insight into system design. We derive the 
asymptotically optimal feedback strategy, propose a realistic 
on/off criterion, and quantify the spatial efficiency. The 
key observation is that the number of on-users should 
be appropriately chosen as a function of system parameters. 
Finally, a practical scheme is developed for finite 
dimensional systems. Simulations show that this scheme 
achieves a significant gain compared with previously 
studied schemes with a presumed number of on-users. 



Appendix 
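A. Proof of Proposition 1 

This proposition is proved by a standard large deviation 
argument. Note that the $\|\mathbf{h}_i\|$'s ($i = 1, \cdots, m$) are independent 
and identically distributed. Hence 

\[ \Pr\left( \max_{1 \le i \le m} \frac{1}{L} \|\mathbf{h}_i\|^2 > 1 + \epsilon \right) = 1 - \left[ \Pr\left( \frac{1}{L} \|\mathbf{h}_1\|^2 \le 1 + \epsilon \right) \right]^m \]
\[ = 1 - \exp\left\{ m \log\left( 1 - \Pr\left( \frac{1}{L} \|\mathbf{h}_1\|^2 > 1 + \epsilon \right) \right) \right\}. \]

For all $\alpha \in (0, 1)$, by Chebyshev's inequality, 

\[ \Pr\left( \frac{1}{L} \|\mathbf{h}_1\|^2 > 1 + \epsilon \right) \le \exp\left\{ -L \left( \alpha\epsilon - \log \mathrm{E}\left[ e^{\alpha (|h_{1,1}|^2 - 1)} \right] \right) \right\} \]
\[ = \exp\left\{ -L \left( \alpha (1 + \epsilon) + \log(1 - \alpha) \right) \right\}. \]

Take $\alpha = \frac{\epsilon}{1 + \epsilon}$. We have 

\[ \Pr\left( \frac{1}{L} \|\mathbf{h}_1\|^2 > 1 + \epsilon \right) \le \exp\left\{ -L \left( \epsilon - \log(1 + \epsilon) \right) \right\}. \]

Let $f_+(\epsilon) := \epsilon - \log(1 + \epsilon)$. It can be verified that 
$f_+(\epsilon) > 0$ for $\epsilon > 0$. Thus, for any given $\delta > 0$, if L is 
sufficiently large, 

\[ \Pr\left( \max_{1 \le i \le m} \frac{1}{L} \|\mathbf{h}_i\|^2 > 1 + \epsilon \right) \le 1 - \exp\left( \bar{m} L (1 + o(1)) \log\left( 1 - \exp(-L f_+(\epsilon)) \right) \right) \]
\[ = 1 - \exp\left( -\bar{m} L e^{-L f_+(\epsilon)} (1 + o(1)) \right) < \delta, \]

which proves the first part of Proposition 1. 

The second part is proved similarly. For any given 
$\delta > 0$, 

\[ \Pr\left( \min_{1 \le i \le m} \frac{1}{L} \|\mathbf{h}_i\|^2 < 1 - \epsilon \right) = 1 - \exp\left\{ m \log\left( 1 - \Pr\left( \frac{1}{L} \|\mathbf{h}_1\|^2 < 1 - \epsilon \right) \right) \right\} \]
\[ \overset{(a)}{\le} 1 - \exp\left\{ m \log\left( 1 - e^{-L (\alpha (1 - \epsilon) + \log(1 - \alpha))} \right) \right\} \]
\[ \overset{(b)}{=} 1 - \exp\left\{ m \log\left( 1 - e^{-L (-\log(1 - \epsilon) - \epsilon)} \right) \right\} \]
\[ \overset{(c)}{=} 1 - \exp\left\{ -\bar{m} L e^{-L (-\log(1 - \epsilon) - \epsilon)} (1 + o(1)) \right\} < \delta, \]

where (a) holds for all $\alpha \in (-\infty, 0)$ (by Chebyshev's 
inequality), (b) follows from setting $\alpha = -\frac{\epsilon}{1 - \epsilon}$, and 
(c) follows from the fact that $-\log(1 - \epsilon) - \epsilon > 0$ for 
$\epsilon \in (0, 1)$ and the Taylor expansion of $\log(1 - x)$. 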
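
The single-user tail bound used in this proof, Pr((1/L)||h||^2 > 1 + ε) ≤ exp{-L(ε - log(1 + ε))}, can be sanity-checked by simulation. The sketch below is illustrative only. It assumes that ||h||^2 for a vector of L i.i.d. CN(0,1) entries follows a Gamma distribution with shape L and unit scale, and it uses natural logarithms as in the proof; the function names are hypothetical.

```python
import math
import random

def chernoff_upper_tail(L, eps):
    # exp(-L * f_plus(eps)) with f_plus(eps) = eps - log(1 + eps).
    return math.exp(-L * (eps - math.log1p(eps)))

def empirical_upper_tail(L, eps, trials, rng):
    # Fraction of draws with ||h||^2 / L > 1 + eps, where ||h||^2 ~ Gamma(L, 1).
    threshold = L * (1.0 + eps)
    hits = sum(1 for _ in range(trials) if rng.gammavariate(L, 1.0) > threshold)
    return hits / trials
```

For L = 20 and ε = 0.5 the bound is about 0.15, and the simulated tail probability should fall below it.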

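B. Proof of Proposition 2 

This proposition is based on the observation that the 
$\mathbf{h}_i^\dagger \mathbf{b}_k \sim \mathcal{CN}(0, 1)$ are i.i.d. ($1 \le i \le m$, $1 \le k \le L$). 
Let $\mathbf{B} = [\mathbf{b}_1 \cdots \mathbf{b}_L]$. Then the above observation 
is verified by $\mathrm{E}[(\mathbf{B}^\dagger \mathbf{h}_i)(\mathbf{B}^\dagger \mathbf{h}_i)^\dagger] = \mathbf{I}$ and 
$\mathrm{E}[(\mathbf{B}^\dagger \mathbf{h}_i)(\mathbf{B}^\dagger \mathbf{h}_j)^\dagger] = \mathbf{0}$ for $i \ne j$. Note that 

\[ \Pr\left( \left| \mathbf{h}_i^\dagger \mathbf{b}_k \right|^2 > L\epsilon \right) = e^{-L\epsilon}. \]

For any given $\delta > 0$, as L is sufficiently large, 

\[ \Pr\left( \max_{1 \le i \le m,\ 1 \le k \le L} \frac{1}{L} \left| \mathbf{h}_i^\dagger \mathbf{b}_k \right|^2 > \epsilon \right) = 1 - \left[ 1 - \Pr\left( \left| \mathbf{h}_1^\dagger \mathbf{b}_1 \right|^2 > L\epsilon \right) \right]^{mL} \]
\[ = 1 - \exp\left\{ \bar{m} L^2 (1 + o(1)) \log\left( 1 - \Pr\left( \left| \mathbf{h}_1^\dagger \mathbf{b}_1 \right|^2 > L\epsilon \right) \right) \right\} \]
\[ = 1 - \exp\left\{ -\bar{m} L^2 e^{-L\epsilon} (1 + o(1)) \right\} < \delta, \]

which completes the proof. 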
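
The probability in this proof has the closed form 1 - (1 - e^{-Lε})^{mL} under the i.i.d. CN(0,1) observation, so its decay with L can be evaluated directly. The following is a small numerical sketch; the function name is hypothetical, and `expm1`/`log1p` are used only to keep the tiny probabilities accurate in floating point.

```python
import math

def prob_max_exceeds(L, m, eps):
    """1 - (1 - exp(-L * eps))^(m * L): probability that at least one of the
    m * L i.i.d. exponential variables |h_i^dagger b_k|^2 exceeds L * eps."""
    p = math.exp(-L * eps)
    return -math.expm1(m * L * math.log1p(-p))
```

With m = L and ε = 0.5, the value drops from about 0.49 at L = 10 to below 1e-12 at L = 100, in line with the limit stated in Proposition 2.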






C. Signal Energy Calculation 

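The signal power can be written as 

\[ \frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{q}_1 \right|^2 = \frac{1}{L} \left| \mathbf{h}_1^\dagger [\mathbf{p}_1 \mathbf{p}_1^\perp] [\mathbf{p}_1 \mathbf{p}_1^\perp]^\dagger \mathbf{q}_1 \right|^2 \]
\[ = \frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{q}_1 \right|^2 + \frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{q}_1 \right|^2 \]
\[ + \frac{1}{L} \left( \mathbf{h}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{q}_1 \right) \left( \mathbf{q}_1^\dagger \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{h}_1 \right) + \frac{1}{L} \left( \mathbf{h}_1^\dagger \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{q}_1 \right) \left( \mathbf{q}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{h}_1 \right). \]
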

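1) Asymptotic Analysis: Here, we prove that 
$\frac{1}{L} \mathbf{h}_1^\dagger \mathbf{q}_1 \mathbf{q}_1^\dagger \mathbf{h}_1 \to (1 - \bar{s}) (1 - 2^{-\bar{r}_1})$. It is an application 
of the following Lemmas 3-5. 

Lemma 3: $\frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{q}_1 \right|^2 \to (1 - \bar{s}) (1 - 2^{-\bar{r}_1})$ in probability. 

Proof: We claim that 

\[ \frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{p}_1 \right|^2 \to 1 - 2^{-\bar{r}_1} \]

in probability. It follows from the facts that $\frac{1}{L} \|\mathbf{h}_1\|^2 \to 1$ 
in probability and that $\mathbf{v}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{v}_1 \to 1 - 2^{-\bar{r}_1}$ (Lemma 
2). We shall show that 

\[ \left| \mathbf{p}_1^\dagger \mathbf{q}_1 \right|^2 \to 1 - \bar{s} \]

in probability. Note that $\mathbf{p}_1$ and $\mathbf{T}_1$ are isotropically 
distributed and independent. The statistics of $\mathbf{T}_1^\dagger \mathbf{p}_1$ is 
the same as that of 

\[ \frac{1}{\|\mathbf{h}'\| / \sqrt{L}} \cdot \frac{1}{\sqrt{L}} \mathbf{T}_1^\dagger \mathbf{h}', \]

where $\mathbf{h}' \in \mathbb{C}^{L \times 1}$ is a random Gaussian vector with 
independent $\mathcal{CN}(0, 1)$ entries. Note that $\mathbf{T}_1$ has rank 
$L - (s - 1)$ with probability one. $\mathbf{T}_1^\dagger \mathbf{h}'$ contains $L - s + 1$ 
i.i.d. $\mathcal{CN}(0, 1)$ entries with probability one. It follows 
that $\frac{1}{L} \|\mathbf{h}'\|^2 \to 1$, 

\[ \frac{1}{L} \left\| \mathbf{T}_1^\dagger \mathbf{h}' \right\|^2 \to 1 - \bar{s} \]

and 

\[ \left\| \mathbf{T}_1^\dagger \mathbf{p}_1 \right\|^2 \to 1 - \bar{s} \]

in probability. Hence, 

\[ \left| \mathbf{p}_1^\dagger \mathbf{q}_1 \right|^2 = \mathbf{p}_1^\dagger \mathbf{T}_1 \mathbf{T}_1^\dagger \mathbf{p}_1 \to 1 - \bar{s} \]

in probability. ■ 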

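Lemma 4: 

\[ \frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{q}_1 \right|^2 \to 0 \]

in probability. 

Proof: Suppose that $\mathbf{p}_1$ is given. Without loss of 
generality, assume that $\mathbf{p}_1 = [1, 0, \cdots, 0]^\dagger$.$^4$ Let 

\[ \mathbf{w} := \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{h}_1 / \left\| \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{h}_1 \right\| \]

be the unitary projection of $\mathbf{h}_1$ on $\mathbf{p}_1^\perp$. Then $\mathbf{w}$ has the 
form $[0, w_1, \cdots, w_{L-1}]^\dagger$. We shall show that it is invariantly 
distributed under the rotation 

\[ \mathbf{U}_1 = \begin{bmatrix} 1 & \mathbf{0}^\dagger \\ \mathbf{0} & \mathbf{U}_{L-1} \end{bmatrix}, \]

where $\mathbf{U}_{L-1}$ is an arbitrary $(L-1) \times (L-1)$ unitary matrix, 
as follows. Let 

\[ \mathcal{H}_1 := \left\{ \mathbf{h}_1 : \mathfrak{q}(\mathbf{h}_1, \mathcal{B}_1) = \mathbf{p}_1 \right\}. \]

Note that for any $\mathbf{h}_1 \in \mathcal{H}_1$, 

\[ \mathfrak{q}(\mathbf{U}_1 \mathbf{h}_1, \mathbf{U}_1 \mathcal{B}_1) = \mathbf{U}_1 \mathbf{p}_1 = \mathbf{p}_1. \]

$\mathcal{H}_1$ is invariantly distributed under $\mathbf{U}_1$. Further, $\mathbf{p}_1^\perp$ is 
also invariantly distributed under $\mathbf{U}_1$. Since $\mathbf{w}$ is nothing 
but the unitary projection of $\mathbf{h}_1$ on $\mathbf{p}_1^\perp$, $\mathbf{w}$ is invariantly 
distributed under $\mathbf{U}_1$ (see also [10]). Hence, the statistics 
of $\mathbf{w}$ is the same as that of 

\[ \frac{1}{\|\mathbf{h}'_{L-1}\|} \begin{bmatrix} 0 \\ \mathbf{h}'_{L-1} \end{bmatrix}, \]

where $\mathbf{h}'_{L-1} \in \mathbb{C}^{(L-1) \times 1}$ is a random standard Gaussian 
vector. It can be verified that for any given $\mathbf{q} \in \mathbb{C}^{L \times 1}$ 
with unit norm, 

\[ \frac{1}{\sqrt{L-1}} \left[ 0, \mathbf{h}'^\dagger_{L-1} \right] \mathbf{q} \to 0 \]

in probability. Now note that 

\[ \frac{1}{L} \left\| \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{h}_1 \right\|^2 \le \frac{1}{L} \|\mathbf{h}_1\|^2 \to 1 \]

and 

\[ \frac{1}{L-1} \left\| \mathbf{h}'_{L-1} \right\|^2 \to 1 \]

in probability. This lemma is proved. ■ 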
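Lemma 5: 

\[ \frac{1}{L} \left( \mathbf{h}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{q}_1 \right) \left( \mathbf{q}_1^\dagger \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{h}_1 \right) \to 0 \]

in probability. 

Proof: It follows from the facts that 

\[ \frac{1}{\sqrt{L}} \left| \mathbf{h}_1^\dagger \mathbf{p}_1 \mathbf{p}_1^\dagger \mathbf{q}_1 \right| \to c < \infty \]

and 

\[ \frac{1}{\sqrt{L}} \left| \mathbf{q}_1^\dagger \mathbf{p}_1^\perp (\mathbf{p}_1^\perp)^\dagger \mathbf{h}_1 \right| \to 0 \]

in probability. ■ 

4 If $\mathbf{p}_1$ does not have the claimed form, we then apply the rotation 
$[\mathbf{p}_1 \mathbf{p}_1^\perp]^\dagger$ for some $\mathbf{p}_1^\perp$ to $\mathbf{h}_1, \cdots, \mathbf{h}_s$ and $\mathcal{B}_1, \cdots, \mathcal{B}_s$. This rotation 
gives $\mathbf{p}'_1 = \mathfrak{q}(\mathbf{h}'_1, \mathcal{B}'_1) = [1, 0, \cdots, 0]^\dagger$ but will not change the 
analysis. 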
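2) Finite Dimensional Analysis: For finite dimensional 
systems, we shall show that 

\[ \mathrm{E}\left[ \frac{1}{L} \mathbf{h}_1^\dagger \mathbf{q}_1 \mathbf{q}_1^\dagger \mathbf{h}_1 \right] = (1 - D_1)\left(1 - \frac{s-1}{L}\right) + D_1 \frac{s-1}{L(L-1)}. \]

This result is proved by combining Lemmas 6-9. 

Lemma 6: Given $\mathbf{p}_1$, 

\[ \mathrm{E}_{\mathbf{T}_1}\left[ \left| \mathbf{p}_1^\dagger \mathbf{q}_1 \right|^2 \right] = \frac{L - s + 1}{L}. \]

Furthermore, 

\[ \mathrm{E}_{\mathbf{h}_1, \mathcal{B}_1}\left[ \frac{1}{L} \left| \mathbf{h}_1^\dagger \mathbf{p}_1 \right|^2 \right] = 1 - D_1. \]

Proof: Given $\mathbf{p}_1$, 

\[ \left| \mathbf{p}_1^\dagger \mathbf{q}_1 \right|^2 = \mathbf{p}_1^\dagger \mathbf{T}_1 \mathbf{T}_1^\dagger \mathbf{p}_1. \]

Note that $\mathbf{T}_1 \in \mathcal{U}_{L \times (L-s+1)}$ with probability one, 
is isotropically distributed and independent of $\mathbf{p}_1$. By 
arguments on the Grassmann manifold [9], it can be 
verified that 

\[ \mathrm{E}_{\mathbf{T}_1}\left[ \mathbf{p}_1^\dagger \mathbf{T}_1 \mathbf{T}_1^\dagger \mathbf{p}_1 \right] = \frac{L - s + 1}{L}. \]
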
Lemma 7: Given pi and p^, 



E 



Ti 



(p^^iqlPi 1 



1 



L(L-l) 
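Proof: For any given $\mathbf{V} \in \mathcal{U}_{L-1}$, let

$$\mathbf{U} = [\mathbf{p}\ \mathbf{p}^{\perp}] \begin{bmatrix} 1 & \\ & \mathbf{V} \end{bmatrix} [\mathbf{p}\ \mathbf{p}^{\perp}]^{\dagger}. \qquad (12)$$

Then $\mathbf{U} \in \mathcal{U}_L$, $\mathbf{U} \mathbf{p}^{\perp} = \mathbf{p}^{\perp} \mathbf{V}$ and $\mathbf{U} \mathbf{p} = \mathbf{p}$. Let $\mathbf{T} \in \mathcal{U}_{L \times (L-s+1)}$ be isotropically distributed and independent
of $\mathbf{p}$ and $\mathbf{p}^{\perp}$. Then

$$\begin{aligned}
\mathrm{E}_{\mathbf{T}} \left[ (\mathbf{p}^{\perp})^{\dagger} \frac{\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}}{\|\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}\|} \left( \frac{\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}}{\|\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}\|} \right)^{\dagger} \mathbf{p}^{\perp} \right]
&\overset{(a)}{=} \mathrm{E}_{\mathbf{U}\mathbf{T}} \left[ (\mathbf{p}^{\perp})^{\dagger} \frac{\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}}{\|\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}\|} \left( \frac{\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}}{\|\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}\|} \right)^{\dagger} \mathbf{p}^{\perp} \right] \\
&= \mathrm{E}_{\mathbf{U}\mathbf{T}} \left[ (\mathbf{U}\mathbf{p}^{\perp})^{\dagger} \frac{\mathbf{U}\mathbf{T}(\mathbf{U}\mathbf{T})^{\dagger}\mathbf{U}\mathbf{p}}{\|\mathbf{U}\mathbf{T}(\mathbf{U}\mathbf{T})^{\dagger}\mathbf{U}\mathbf{p}\|} \left( \frac{\mathbf{U}\mathbf{T}(\mathbf{U}\mathbf{T})^{\dagger}\mathbf{U}\mathbf{p}}{\|\mathbf{U}\mathbf{T}(\mathbf{U}\mathbf{T})^{\dagger}\mathbf{U}\mathbf{p}\|} \right)^{\dagger} (\mathbf{U}\mathbf{p}^{\perp}) \right] \\
&\overset{(b)}{=} \mathbf{V}^{\dagger}\, \mathrm{E}_{\mathbf{T}} \left[ (\mathbf{p}^{\perp})^{\dagger} \frac{\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}}{\|\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}\|} \left( \frac{\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}}{\|\mathbf{T}\mathbf{T}^{\dagger}\mathbf{p}\|} \right)^{\dagger} \mathbf{p}^{\perp} \right] \mathbf{V},
\end{aligned} \qquad (13)$$

where (a) follows from the fact that $\mathbf{T}$ is isotropically
distributed and therefore $d\mu_{\mathbf{T}} = d\mu_{\mathbf{U}\mathbf{T}}$, and (b) follows
from the variable change from $\mathbf{U}\mathbf{T}$ to $\mathbf{T}$. Since (13) is
valid for arbitrary $\mathbf{V} \in \mathcal{U}_{L-1}$, $\mathrm{E}_{\mathbf{T}}[\cdots] = c\mathbf{I}$ for some
constant $c > 0$.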



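We calculate $c$ as follows. Note that $\mathbf{q}^{\dagger} [\mathbf{p}\ \mathbf{p}^{\perp}] [\mathbf{p}\ \mathbf{p}^{\perp}]^{\dagger} \mathbf{q} = 1$. Then

$$c = \frac{1}{L-1} \mathrm{tr}\left( \mathrm{E}\left[ (\mathbf{p}^{\perp})^{\dagger} \mathbf{q} \mathbf{q}^{\dagger} \mathbf{p}^{\perp} \right] \right)
= \frac{1}{L-1} \mathrm{E}\left[ \mathbf{q}^{\dagger} \mathbf{p}^{\perp} (\mathbf{p}^{\perp})^{\dagger} \mathbf{q} \right]
= \frac{1}{L-1} \left( 1 - \frac{L-s+1}{L} \right)
= \frac{s-1}{L(L-1)}.$$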
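
The two moments $(L-s+1)/L$ and $(s-1)/(L(L-1))$ can be checked by simulation. The sketch below is illustrative only and rests on assumptions that are not taken from the paper: $\mathbf{p}_1$ is fixed to $[1, 0, \cdots, 0]^{\dagger}$, the other $s-1$ quantized directions are replaced by isotropic Gaussian directions, and $\mathbf{q}_1$ is taken as the normalized projection of $\mathbf{p}_1$ onto their orthogonal complement. With $\mathbf{p}_1^{\perp}$ chosen as the remaining standard basis vectors, $|q_1[0]|^2$ estimates $|\mathbf{p}_1^{\dagger}\mathbf{q}_1|^2$ and $|q_1[1]|^2$ estimates one diagonal entry of $(\mathbf{p}_1^{\perp})^{\dagger}\mathbf{q}_1\mathbf{q}_1^{\dagger}\mathbf{p}_1^{\perp}$. All function names are hypothetical.

```python
import math
import random


def complex_gaussian(n, rng):
    """Draw n i.i.d. CN(0, 1) entries."""
    sigma = math.sqrt(0.5)
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def remove_component(v, u):
    """Subtract from v its component along the unit vector u."""
    c = inner(u, v)
    return [vk - c * uk for vk, uk in zip(v, u)]


def normalize(v):
    n = math.sqrt(inner(v, v).real)
    return [vk / n for vk in v]


def zero_forcing_vector(L, s, rng):
    """Normalized projection of e_1 onto the complement of s-1 random directions."""
    basis = []
    for _ in range(s - 1):
        v = complex_gaussian(L, rng)  # isotropic stand-in for a quantized direction
        for u in basis:               # Gram-Schmidt orthogonalization
            v = remove_component(v, u)
        basis.append(normalize(v))
    q = [1 + 0j] + [0j] * (L - 1)
    for u in basis:
        q = remove_component(q, u)
    return normalize(q)


def estimate_moments(L, s, trials, seed=0):
    """Return Monte Carlo estimates of E|q[0]|^2 and E|q[1]|^2."""
    rng = random.Random(seed)
    along, ortho = 0.0, 0.0
    for _ in range(trials):
        q = zero_forcing_vector(L, s, rng)
        along += abs(q[0]) ** 2
        ortho += abs(q[1]) ** 2
    return along / trials, ortho / trials


if __name__ == "__main__":
    L, s = 4, 3
    along, ortho = estimate_moments(L, s, 20000)
    print(along, (L - s + 1) / L)
    print(ortho, (s - 1) / (L * (L - 1)))
```

Under these assumptions the two printed pairs should agree up to Monte Carlo error.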



Lemma 8: Given px and p^, 



E 



hi,8i 



(P^^rhtp^ 



D 



LL-l- 



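Proof: For an arbitrary $\mathbf{V} \in \mathcal{U}_{L-1}$, let $\mathbf{U} \in \mathcal{U}_L$ be defined as
in (12). Then

$$q(\mathbf{U}\mathbf{h}_1, \mathbf{U}\mathcal{B}_1) = \mathbf{U} q(\mathbf{h}_1, \mathcal{B}_1) = \mathbf{U}\mathbf{p}_1 = \mathbf{p}_1.$$

By following the same idea as in the proof of Lemma 7,
this lemma is proved. ■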
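Lemma 9: Given $\mathbf{p}_1$ and $\mathbf{p}_1^{\perp}$,

$$\mathrm{E}\left[ \mathbf{p}_1^{\dagger} \mathbf{q}_1 \mathbf{q}_1^{\dagger} \mathbf{p}_1^{\perp} \right] = \mathbf{0}^{\dagger}$$

and

$$\mathrm{E}_{\mathbf{h}_1, \mathcal{B}_1}\left[ \mathbf{p}_1^{\dagger} \mathbf{h}_1 \mathbf{h}_1^{\dagger} \mathbf{p}_1^{\perp} \right] = \mathbf{0}^{\dagger}.$$

Proof: By the same method as in Lemmas 7 and 8,
for an arbitrary $\mathbf{V} \in \mathcal{U}_{L-1}$, $\mathrm{E}[\cdots] = \mathrm{E}[\cdots]\mathbf{V}$, which
holds if and only if $\mathrm{E}[\cdots] = \mathbf{0}^{\dagger}$. ■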

D. Interference Power Calculation 

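The interference from user $j$ to user 1 can be written
as

$$\frac{1}{L} \left| \mathbf{h}_1^{\dagger} \mathbf{q}_j \right|^2
= \frac{1}{L} \left| \mathbf{h}_1^{\dagger} [\mathbf{p}_1\ \mathbf{p}_1^{\perp}] [\mathbf{p}_1\ \mathbf{p}_1^{\perp}]^{\dagger} \mathbf{q}_j \right|^2
= \frac{1}{L} \left| \mathbf{h}_1^{\dagger} \mathbf{p}_1^{\perp} (\mathbf{p}_1^{\perp})^{\dagger} \mathbf{q}_j \right|^2,$$

where the last step follows from the construction $\mathbf{q}_j \perp \mathbf{p}_1$. The total interference at user 1 is then

$$\frac{1}{L} \sum_{j=2}^{s} \left| \mathbf{h}_1^{\dagger} \mathbf{p}_1^{\perp} (\mathbf{p}_1^{\perp})^{\dagger} \mathbf{q}_j \right|^2.$$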

1) Asymptotic Analysis: Without loss of generality,
assume that $\mathbf{p}_1 = [1, 0, \cdots, 0]^{\dagger}$. We have analyzed the



property of h\pi (pi)^ in the proof of Lemma|4| It has 
been shown there that the statistics of ^hjp^ (pi~) 

where 



lyzea 

aQ it: 



is the same as that of X^- 



X, 



h\pi ( P r) f 



1 L-l| 



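in probability and $\mathbf{h}'_{L-1} \in \mathbb{C}^{(L-1) \times 1}$ is a standard
Gaussian vector. Now for any given $2 \le j \le s$, since
$\mathbf{q}_j \perp \mathbf{p}_1$ and $\|\mathbf{q}_j\| = 1$, the statistics of $\mathbf{q}_j$ are the same
as those of

$$\frac{1}{\left\| \mathbf{h}''_{L-1} \right\|} \begin{bmatrix} 0 \\ \mathbf{h}''_{L-1} \end{bmatrix},$$

where $\mathbf{h}''_{L-1} \in \mathbb{C}^{(L-1) \times 1}$ is another standard Gaussian
vector, and $[0, \mathbf{h}'^{\dagger}_{L-1}]\, \mathbf{q}_j \sim \mathcal{CN}(0, 1)$. It then can be verified that

$$\frac{1}{L} \sum_{j=2}^{s} \left| [0, \mathbf{h}'^{\dagger}_{L-1}]\, \mathbf{q}_j \right|^2 \to \bar{s}$$

in probability. Therefore, the total interference converges
to $\bar{s}\, 2^{-\bar{r}}$ in probability.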

2) Finite Dimensional Analysis: It can be shown that 
the average interference power is 



L 



-tr E 



(ptfEhl)pi 



i=2 

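Given $\mathbf{p}_1$ and $\mathbf{p}_1^{\perp}$, it can be shown that

$$\sum_{j=2}^{s} \mathrm{E}\left[ (\mathbf{p}_1^{\perp})^{\dagger} \mathbf{q}_j \mathbf{q}_j^{\dagger} \mathbf{p}_1^{\perp} \right] = \frac{s-1}{L-1} \mathbf{I}$$

by the same technique as in the proof of Lemma 7. Combining
this fact and Lemma 8 gives the average
interference power.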
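
The identity $\sum_{j=2}^{s} \mathrm{E}[(\mathbf{p}_1^{\perp})^{\dagger}\mathbf{q}_j\mathbf{q}_j^{\dagger}\mathbf{p}_1^{\perp}] = \frac{s-1}{L-1}\mathbf{I}$ can also be checked by simulation. The following is a rough sketch under assumptions that do not come from the paper: $\mathbf{p}_1 = [1, 0, \cdots, 0]^{\dagger}$, the remaining $\mathbf{p}_j$ are isotropic Gaussian directions rather than codebook outputs, and each $\mathbf{q}_j$ is the normalized projection of $\mathbf{p}_j$ onto the orthogonal complement of the other $\mathbf{p}_i$. Only one diagonal entry is estimated, and the function names are hypothetical.

```python
import math
import random


def complex_gaussian(n, rng):
    """Draw n i.i.d. CN(0, 1) entries."""
    sigma = math.sqrt(0.5)
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def remove_component(v, u):
    """Subtract from v its component along the unit vector u."""
    c = inner(u, v)
    return [vk - c * uk for vk, uk in zip(v, u)]


def normalize(v):
    n = math.sqrt(inner(v, v).real)
    return [vk / n for vk in v]


def zf_vector(P, j):
    """Normalized projection of P[j] onto the complement of the other vectors in P."""
    basis = []
    for i, p in enumerate(P):
        if i == j:
            continue
        v = list(p)
        for u in basis:  # Gram-Schmidt on the interfering directions
            v = remove_component(v, u)
        basis.append(normalize(v))
    q = list(P[j])
    for u in basis:
        q = remove_component(q, u)
    return normalize(q)


def mean_interference_diagonal(L, s, trials, seed=0):
    """Estimate one diagonal entry of sum_{j>=2} E[(p1_perp)^dagger q_j q_j^dagger p1_perp]."""
    rng = random.Random(seed)
    e1 = [1 + 0j] + [0j] * (L - 1)
    total = 0.0
    for _ in range(trials):
        P = [e1] + [complex_gaussian(L, rng) for _ in range(s - 1)]
        for j in range(1, s):
            q = zf_vector(P, j)
            total += abs(q[1]) ** 2  # q[0] is zero because q_j is orthogonal to p_1
    return total / trials


if __name__ == "__main__":
    L, s = 4, 3
    print(mean_interference_diagonal(L, s, 5000), (s - 1) / (L - 1))
```

The estimate should be close to $(s-1)/(L-1)$, in line with the trace argument used for Lemma 7.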

E. Proof of Theorem 3

Here, we only prove Theorem 3 under the assumption that $d\mu_{\eta}$
contains no mass point. The proof for $d\mu_{\eta}$ containing
mass points follows the same lines but is much more
complicated and is omitted due to space limitations. For
compositional convenience, we use the following nota- 
tions: /(„,*) = log (1 + ^1=*), f(r,,s) - dfM 



ds 



and y (s) = J. f (r], s) dfj, v where t is given by 
inf {t : f t °° d/ii] < s } . When dfj. n contains no mass 
point, fifty is reduced to = y' (s). To proceed, we need 
Lemmas 10-12 below.
Lemma 10:

$$y'(s) := \frac{dy}{ds} = f(t, s) + \int_t^{\infty} f'(\eta, s)\, d\mu_{\eta}.$$

This lemma is proved by elementary calculation.
Lemma 11: If $y'(s) = 0$ implies $y''(s) < 0$ on $(0, 1)$,
then one of the following three cases must be true:

1) $y'(s) > 0$ on $(0, 1)$ and $\sup_{s \in (0,1)} y(s) = \lim_{s \to 1} y(s)$;

2) $y'(s) < 0$ on $(0, 1)$ and $\sup_{s \in (0,1)} y(s) = \lim_{s \to 0} y(s)$;

3) there exists a unique $s^* \in (0, 1)$ such that $y'(s^*) = 0$, and $\sup_{s \in (0,1)} y(s) = y(s^*)$.

Proof: Since the first two cases are trivial, we only
prove the third case. We shall prove that there exists
a unique $s^* \in (0,1)$ s.t. $y'(s^*) = 0$. The existence is
clear since we have excluded the first two cases. The
uniqueness is proved by constructing a contradiction.
Suppose that there are other $z_i \in (0, 1)$ s.t. $y'(z_i) = 0$. Take
the largest $z_1 < s^*$. Since $y''(z_1) < 0$ and $y''(s^*) < 0$,
there exists a $\delta < \frac{s^* - z_1}{2}$ s.t. $y'(z) < 0$ on $(z_1, z_1 + \delta)$
and $y'(z) > 0$ on $(s^* - \delta, s^*)$. But this implies that
there exists $z' \in [z_1 + \delta, s^* - \delta]$ s.t. $y'(z') = 0$, which
contradicts the assumption that $z_1 < s^*$ is the largest
root of $y'(z)$. ■
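
Case 3 of Lemma 11 suggests a simple numerical procedure: when $y'$ is positive near 0, negative near 1, and every stationary point is a strict local maximum, the unique maximizer can be located by bisection on the sign of $y'$. The sketch below illustrates this with a toy objective $y(s) = s \log\frac{1+c}{s+c}$, which is a hypothetical stand-in chosen because it is strictly concave on $(0,1)$; it is not the function $y$ of Theorem 3, and the constant $c = 0.1$ and all function names are assumptions.

```python
import math

C = 0.1  # hypothetical constant of the toy objective


def y(s, c=C):
    """Toy objective (an assumption, not the paper's y)."""
    return s * math.log((1.0 + c) / (s + c))


def y_prime(s, c=C):
    """Derivative of the toy objective with respect to s."""
    return math.log((1.0 + c) / (s + c)) - s / (s + c)


def stationary_point(dy, lo=1e-9, hi=1.0 - 1e-9, iters=200):
    """Bisection on the sign of dy, assuming dy > 0 near 0 and dy < 0 near 1."""
    if not (dy(lo) > 0.0 > dy(hi)):
        raise ValueError("no sign change: case 1 or 2 of the lemma applies")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dy(mid) > 0.0:
            lo = mid   # the root lies to the right of mid
        else:
            hi = mid   # the root lies to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    s_star = stationary_point(y_prime)
    print(s_star, y(s_star))
```

Bisection is used here only because the lemma guarantees a single sign change of $y'$; any one-dimensional root finder would serve equally well.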
Lemma 12:

$$\frac{2x}{1+x} + \log^2(1+x) - 2\log(1+x) > 0$$

for all $x > 0$.
Proof: Let

$$g(x) = \frac{2x}{1+x} + \log^2(1+x) - 2\log(1+x).$$

Since $g(0) = 0$, this lemma is true if $g'(x) > 0$ for
$x > 0$. Note that

$$g'(x) = \frac{2}{1+x} \left( \log(1+x) - \frac{x}{1+x} \right).$$

We have $g'(x) > 0$ on $x > 0$ if

$$\hat{g}(x) = \log(1+x) - \frac{x}{1+x} > 0$$

on $x > 0$. Since $\hat{g}(0) = 0$ and $\hat{g}'(x) = \frac{x}{(1+x)^2} > 0$ on
$x > 0$, $\hat{g}(x) > 0$ on $x > 0$. This lemma is proved.
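
As a sanity check on Lemma 12, the inequality and the derivative used in its proof can be evaluated numerically. This is a minimal sketch assuming that $\log$ denotes the natural logarithm; the grid and the function names are illustrative choices, and a numerical check of this kind does not replace the proof.

```python
import math


def g(x):
    """g(x) = 2x/(1+x) + log^2(1+x) - 2 log(1+x), natural logarithm assumed."""
    lg = math.log1p(x)
    return 2.0 * x / (1.0 + x) + lg * lg - 2.0 * lg


def g_prime(x):
    """Closed-form derivative: (2/(1+x)) * (log(1+x) - x/(1+x))."""
    return 2.0 / (1.0 + x) * (math.log1p(x) - x / (1.0 + x))


def central_difference(func, x, h=1e-6):
    """Numerical derivative used to cross-check g_prime."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    grid = [0.01 * 1.2 ** k for k in range(70)]  # roughly 0.01 to 3e3
    print(min(g(x) for x in grid) > 0.0)
    print(abs(central_difference(g, 1.0) - g_prime(1.0)))
```

Near $x = 0$ the function behaves like $x^3/3$, so values on the grid are small but remain positive.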

In order to prove Theorem 3, as the first step, we show
that there exists $s^* \in (0, 1)$ s.t. $y'(s^*) = 0$. Note that



y' (s) = log 1 + 1 



1 - a 



1 



It is easy to verify that $\lim_{s \to 1} y'(s) < 0$. Now let $s \to 0$. Since



1 



>1 



s > 1 



n 



< 2 as s < -. 



s s 



But 



lim log 1 + t 



1 



Then $\lim_{s \to 0} y'(s) > 0$. We conclude that $y'(s^*) = 0$
happens for some $s^* \in (0, 1)$.

According to Lemma 11, it is sufficient to prove that
for $s \in (0, 1)$, $y' = 0$ implies $y'' < 0$. Set $y' = 0$. Then



log ( i + ^^T ' " ' 



i-v 

l-s 



dpi-q 



l + r\ 



= 0. 
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Now we calculate y" 



/(*,*)+ J f{r,,s)dn n \ 

pOO 

2/' (*,«)+/ f'(v,s)dfi 
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S 

where (a) comes from the fact that t- > t^—^- and 
Jensen's inequality, and (b) is from the assumption $y' = 0$.
Note that > 0. By Lemma 12, $y'' < 0$. The
$s^* \in (0,1)$ s.t. $y'(s^*) = 0$ is therefore unique and
maximizes $y$.
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